Human behavior emerges from planning over elaborate decompositions of tasks into goals, subgoals, and low-level actions. How are these decompositions created and used? Here, we propose and evaluate a normative framework for task decomposition based on the simple idea that people decompose tasks to reduce the overall cost of planning while maintaining task performance. Analyzing 11,117 distinct graph-structured planning tasks, we find that our framework justifies several existing heuristics for task decomposition and makes predictions that can be distinguished from two alternative normative accounts. We report a behavioral study of task decomposition ($N=806$) that uses 30 randomly sampled graphs, a larger and more diverse set than that of any previous behavioral study on this topic. We find that human responses are more consistent with our framework for task decomposition than alternative normative accounts and are most consistent with a heuristic -- betweenness centrality -- that is justified by our approach. Taken together, our results provide new theoretical insight into the computational principles underlying the intelligent structuring of goal-directed behavior.
translated by 谷歌翻译
We consider a sequential decision making task where we are not allowed to evaluate parameters that violate an a priori unknown (safety) constraint. A common approach is to place a Gaussian process prior on the unknown constraint and allow evaluations only in regions that are safe with high probability. Most current methods rely on a discretization of the domain and cannot be directly extended to the continuous case. Moreover, the way in which they exploit regularity assumptions about the constraint introduces an additional critical hyperparameter. In this paper, we propose an information-theoretic safe exploration criterion that directly exploits the GP posterior to identify the most informative safe parameters to evaluate. Our approach is naturally applicable to continuous domains and does not require additional hyperparameters. We theoretically analyze the method and show that we do not violate the safety constraint with high probability and that we explore by learning about the constraint up to arbitrary precision. Empirical evaluations demonstrate improved data-efficiency and scalability.
translated by 谷歌翻译
Privacy-preserving machine learning in data-sharing processes is an ever-critical task that enables collaborative training of Machine Learning (ML) models without the need to share the original data sources. It is especially relevant when an organization must assure that sensitive data remains private throughout the whole ML pipeline, i.e., training and inference phases. This paper presents an innovative framework that uses Representation Learning via autoencoders to generate privacy-preserving embedded data. Thus, organizations can share the data representation to increase machine learning models' performance in scenarios with more than one data source for a shared predictive downstream task.
translated by 谷歌翻译
ICECUBE是一种用于检测1 GEV和1 PEV之间大气和天体中微子的光学传感器的立方公斤阵列,该阵列已部署1.45 km至2.45 km的南极的冰盖表面以下1.45 km至2.45 km。来自ICE探测器的事件的分类和重建在ICeCube数据分析中起着核心作用。重建和分类事件是一个挑战,这是由于探测器的几何形状,不均匀的散射和冰中光的吸收,并且低于100 GEV的光,每个事件产生的信号光子数量相对较少。为了应对这一挑战,可以将ICECUBE事件表示为点云图形,并将图形神经网络(GNN)作为分类和重建方法。 GNN能够将中微子事件与宇宙射线背景区分开,对不同的中微子事件类型进行分类,并重建沉积的能量,方向和相互作用顶点。基于仿真,我们提供了1-100 GEV能量范围的比较与当前ICECUBE分析中使用的当前最新最大似然技术,包括已知系统不确定性的影响。对于中微子事件分类,与当前的IceCube方法相比,GNN以固定的假阳性速率(FPR)提高了信号效率的18%。另外,GNN在固定信号效率下将FPR的降低超过8(低于半百分比)。对于能源,方向和相互作用顶点的重建,与当前最大似然技术相比,分辨率平均提高了13%-20%。当在GPU上运行时,GNN能够以几乎是2.7 kHz的中位数ICECUBE触发速率的速率处理ICECUBE事件,这打开了在在线搜索瞬态事件中使用低能量中微子的可能性。
translated by 谷歌翻译
深度卷积神经网络(DCNNS)在面部识别方面已经达到了人类水平的准确性(Phillips等,2018),尽管目前尚不清楚它们如何准确地区分高度相似的面孔。在这里,人类和DCNN执行了包括相同双胞胎在内的具有挑战性的面貌匹配任务。参与者(n = 87)查看了三种类型的面孔图像:同一身份,普通冒名顶替对(来自相似人口组的不同身份)和双胞胎冒名顶替对(相同的双胞胎兄弟姐妹)。任务是确定对是同一个人还是不同的人。身份比较在三个观点区分条件下进行了测试:额叶至额叶,额叶至45度,额叶为90度。在每个观点 - 差异条件下评估了从双胞胎突变器和一般冒险者区分匹配的身份对的准确性。人类对于一般撞击对比双重射手对更准确,准确性下降,一对图像之间的观点差异增加。通过介绍给人类的同一图像对测试了经过训练的面部识别的DCNN(Ranjan等,2018)。机器性能反映了人类准确性的模式,但除了一种条件以外,所有人的性能都处于或尤其是所有人的表现。在所有图像对类型中,比较了人与机器的相似性得分。该项目级别的分析表明,在九种图像对类型中的六种中,人类和机器的相似性等级显着相关[范围r = 0.38至r = 0.63],这表明人类对面部相似性的感知和DCNN之间的一般协议。这些发现还有助于我们理解DCNN的表现,以区分高度介绍面孔,表明DCNN在人类或以上的水平上表现出色,并暗示了人类和DCNN使用的特征之间的均等程度。
translated by 谷歌翻译
由能够连接和交换消息的越来越多的移动设备而激励,我们提出了一种旨在模拟和分析网络中节点移动性的方法。我们注意到文献中的许多现有解决方案依赖于直接在节点联系人图表上计算的拓扑测量,旨在捕获节点在有利于原型设计,设计和部署移动网络的连接和移动模式方面的重要性。但是,每个措施都具有其特异性,并且无法概括最终随时间变化的节点重要性概念。与以前的方法不同,我们的方法基于节点嵌入方法,该方法模型和推出在保留其空间和时间特征的同时在移动性和连接模式中对节点的重要性。我们专注于基于一丝小组会议的案例研究。结果表明,我们的方法提供了提取不同移动性和连接模式的丰富表示,这可能有助于移动网络中的各种应用和服务。
translated by 谷歌翻译
我们制定了一种评估给定两组图像的生成网络性能的度量。当前用于执行此操作的流行绩效指标是Fr \'Echet Inception距离(FID)。 FID假设使用Inception-V3的倒数第二层遵循高斯分布来特征的图像,如果我们希望将FID用作度量标准,则不会违反这种假设。但是,我们表明,ImakeNet数据集的Inception-V3特征不是高斯。特别是,每个边缘都不是高斯。为了解决这个问题,我们使用高斯混合模型(GMM)对特征图像进行建模,并计算限于GMM的2-Wasserstein距离。我们通过使用Inception-V3(或其他分类器)在两组图像上定义了一个称为WAM的性能度量,以表征图像,估算两个GMM,并使用受限的$ 2 $ - WASSERSTEIN距离比较GMMS。我们通过实验表明WAM比FID的优势,包括FID比WAM对不可察觉的图像扰动更敏感。通过建模从Inception-V3作为GMM获得的非高斯特征并使用GMM度量,我们可以更准确地评估生成网络性能。
translated by 谷歌翻译
最近的一些研究描述了深层卷积神经网络,以诊断与人类专家相似甚至卓越表现的乳腺癌乳腺癌。最好的技术之一可以进行两种转移学习:第一个使用在自然图像上训练的模型来创建“补丁分类器”,该模型将小型子图表分类;第二个使用补丁分类器来扫描整个乳房X线照片并创建“单视图全图分类器”。我们建议进行第三次转移学习,以获取“两视图分类器”,以使用两种乳房X线摄影视图:双侧颅颅和中外侧倾斜。我们使用效率网络作为模型的基础。我们使用CBIS-DDSM数据集“端到端”训练整个系统。为了确保统计鲁棒性,我们使用以下方式两次测试系统,(a)5倍交叉验证; (b)数据集的原始培训/测试部门。我们的技术使用5倍的交叉验证达到0.9344的AUC(在ROC的误差率相等的误差率下,准确性,灵敏度和特异性为85.13%)。据我们所知,使用原始的数据集除法,我们的技术达到了0.8483,尽管我们知道的最高的AUC在此问题上,尽管每项工作的测试条件上的细微差异不允许进行准确的比较。推理代码和模型可在https://github.com/dpetrini/two-views-classifier上获得
translated by 谷歌翻译
Existing automated techniques for software documentation typically attempt to reason between two main sources of information: code and natural language. However, this reasoning process is often complicated by the lexical gap between more abstract natural language and more structured programming languages. One potential bridge for this gap is the Graphical User Interface (GUI), as GUIs inherently encode salient information about underlying program functionality into rich, pixel-based data representations. This paper offers one of the first comprehensive empirical investigations into the connection between GUIs and functional, natural language descriptions of software. First, we collect, analyze, and open source a large dataset of functional GUI descriptions consisting of 45,998 descriptions for 10,204 screenshots from popular Android applications. The descriptions were obtained from human labelers and underwent several quality control mechanisms. To gain insight into the representational potential of GUIs, we investigate the ability of four Neural Image Captioning models to predict natural language descriptions of varying granularity when provided a screenshot as input. We evaluate these models quantitatively, using common machine translation metrics, and qualitatively through a large-scale user study. Finally, we offer learned lessons and a discussion of the potential shown by multimodal models to enhance future techniques for automated software documentation.
translated by 谷歌翻译
While the capabilities of autonomous systems have been steadily improving in recent years, these systems still struggle to rapidly explore previously unknown environments without the aid of GPS-assisted navigation. The DARPA Subterranean (SubT) Challenge aimed to fast track the development of autonomous exploration systems by evaluating their performance in real-world underground search-and-rescue scenarios. Subterranean environments present a plethora of challenges for robotic systems, such as limited communications, complex topology, visually-degraded sensing, and harsh terrain. The presented solution enables long-term autonomy with minimal human supervision by combining a powerful and independent single-agent autonomy stack, with higher level mission management operating over a flexible mesh network. The autonomy suite deployed on quadruped and wheeled robots was fully independent, freeing the human supervision to loosely supervise the mission and make high-impact strategic decisions. We also discuss lessons learned from fielding our system at the SubT Final Event, relating to vehicle versatility, system adaptability, and re-configurable communications.
translated by 谷歌翻译